How to fix your Indian accent using AI

by Punya Mishra | Sunday, February 19, 2023

Featured image design © Punya Mishra (background image courtesy PxHere)

There are many meanings to the phrase “having a voice.” It can mean whether you are present and acknowledged within a space – but most literally it means what you say and how you speak. And sticking to the literal meaning, what does it mean if your voice is changed? Without your permission? Does it change the metaphorical “voice” – your presence in a given space? Does it change your sense of belonging? Does it change your identity?

If all this sounds super abstract, please read the post below about how AI technologies can quite literally change your voice and accent… and a personal experience I had with one of these technologies.

Sorry to Bother You is a hilariously dark, somewhat surreal, satirical movie that very creatively explores issues of race and technological change in an age of rampant capitalism. Set in a slightly askew version of Oakland, CA, the film's central conceit is that its main character, who is Black, finds success in his telemarketing career only when he speaks with a white accent. You can see one example in the embedded clip from the film:

Clearly, this is a satire, a pointed one but satire nonetheless.

Imagine my surprise when I learned about Sanas, a Bay Area startup that offers a service to call-center companies (which are predominantly in Asia) that will change Asian-sounding accents, on the fly, to make them sound more “neutral.” Of course, in this case, neutral means “White American.” And the AI that powers it does all this in real time. As the website says, “Sound Local, Globally… with the power of effortless real-time accent translation.”

You can check out this new technology of “real-time accent translation” on their website. You can listen to a call center employee speaking in a standard Indian accent, and with the push of a button change the accent to what can only be called a standard “White” accent.

Just for fun, I played with their sample for a bit and in the MP3 embedded below I randomly switched between the “Indian” and the “American” accent. (Both words in quotes because there is no ONE Indian accent, just as there is no ONE American accent).

It would be funny if it weren’t so real and tone-deaf in its marketing pitch. The prose on the website urges you to “Change the World, Not Your Self,” and to “Make yourself heard,” because it is “Your Voice. Your Choice.” (As if the call center employee has any choice in the matter.)

And why do this? Of course, it is all about empowerment. As the website says:

“Empower team members around the globe to confidently communicate in their voice, no matter who they’re talking to or where they’re calling from, [to] Unlock deeper connections, build trust and affinity, expand diversity, and protect your team’s mental health.” [It is] a step towards empowering individuals, advancing equality, and deepening empathy (emphasis added).

Wow! That’s a lot!

And of course, they have the numbers to back it up. “Accent matching,” the website assures us, “can improve understanding by 31% and customer satisfaction by 21%.”

Irony dies a slow agonizing death in a video embedded on their site where they respond to the hypothetical question: “What does the future sound like?” by responding “It sounds like, YOU.”

Except that that is exactly the opposite of what the software does. It makes you sound like someone who is quite definitely NOT like you! I mean that’s the whole point of the software.

Now we know these AI tools are coming (just see the whole brouhaha about ChatGPT) and that they will be used in interesting and strange ways, ways that will be offensive and problematic, biased in ways that are both subtle and deliberate. But in all fairness to this company, at least they are upfront about it. Their goal is to take accents that are potentially problematic for Americans and convert them to ones that are not. That’s what their business plan is – and clearly there are buyers.

What is more insidious is when tools claim to do one thing but in actuality do something quite different.

Allow me to tell you the story of Adobe Podcast: Enhance, and what it did to my voice.

I am originally from India—so no surprise that I have an Indian accent. If you want to get even more specific I have a pretty typical North-Indian accent (since that is where I spent most of my formative years). My voice, my accent, is who I am. That is part of my identity. It makes me who I am. I have never seen it as being a handicap in any shape or form.

But I guess Adobe thinks otherwise.

The software giant recently released beta software called Adobe Podcast, and one of the tools in its toolkit is called Enhance. The idea is that you can drop your slightly crappy voice recording (with background noise and other distractions) onto their site and the AI software will magically clean it up, making it sound as if it were recorded in a professional studio. There is even a demo on the site that you can listen to for yourself. And, to be fair, it does sound pretty good.

What I did not anticipate, however, was what it would do to my Indian accent. My friend and colleague Sean Leahy informed me that while testing the Enhance software they found something strange was happening. When given a clip recorded in my voice (from an episode of the Learning Futures Podcast), the final output went beyond just cleaning up the background noises.

It subtly changed my accent. It took off the “edges” of my Indian accent, making it a little higher in pitch and making me sound more “American.” This was not something I had asked the software to do. It was just something that was built in.

And in some ways what Adobe Enhance is doing is more insidious than what Sanas does. At least Sanas is upfront about its motives – while the AI behind Adobe Enhance does its magic while claiming to be neutral.

That said, I am honestly not a good judge of my voice – because I genuinely don’t like listening to myself recorded. So I grabbed a couple of clips (regular and enhanced) and asked a few friends what they thought. According to Melissa Warr:

Adobe totally changed your voice–it doesn’t sound like you, it sounds fake… It really seems to take out your identity–since I know you well, it is taking away your voice.

You don’t have to take my (or Melissa’s) word for it. You can judge for yourself by listening to both versions below:

First, the standard recording:

And next the “enhanced” version:

What do you think? Am I imagining things? Or did the software change my accent, ever so slightly?

This “feature” was not advertised anywhere. It was not something I had asked for. Just something built into the system.

There are just so many things to unpack here – so I am not even going to try.

Coming full circle to the movie we started with: what was satire in Sorry to Bother You is now reality, as in the Sanas software, and appears to be embedded (without any clear indication of doing so) in the AI-powered Adobe Podcast Enhance.

What can satire do when reality itself is so messed up?

I am reminded of another movie, Don’t Look Up, which was initially written as a satire, an absurd tragi-comedy on our inability to respond to the climate crisis. However, by the time the movie came out we were living through the collectively muddled response to the COVID-19 pandemic, and the movie now spoke to this new reality. As David Sims said in a conversation in The Atlantic:

It does reflect what’s difficult about satire right now. How do you find ludicrousness in our ludicrous reality? How do you heighten and amuse when everything already feels so heightened all the time?

Sometimes parody and real-life blend in ways that boggle one’s mind.

1 Comment

Steve Salik on February 22, 2023 at 10:13 am

Oh Hell No! This is just another step in the process of turning the world into a cup of premade vanilla pudding. The blandest and most uninteresting landscape I can think of.